A Joint System for Single-Person 2D-Face and 3D-Head Tracking in CHIL Seminars
Abstract
We present the IBM systems submitted and evaluated within the CLEAR’06 evaluation campaign for the tasks of single-person visual 3D tracking (localization) and 2D face tracking on CHIL seminar data. The two systems are sufficiently interconnected to justify their presentation within a single paper as a joint vision system for single-person 2D-face and 3D-head tracking, suitable for smart room environments with multiple synchronized, calibrated, stationary cameras. Indeed, in the developed system, face detection plays a pivotal role in 3D person tracking, being employed both in system initialization and in detecting possible tracking drift. Similarly, 3D person tracking determines the 2D frame regions where a face detector is subsequently applied. The joint system consists of a number of components that employ detection and tracking algorithms, some of which operate on input from all four corner cameras of the CHIL smart rooms, while others select and use two of the four available cameras. The main system highlights are the use of AdaBoost-like multi-pose face detectors, a spatio-temporal dynamic programming algorithm to initialize 3D location hypotheses, and an adaptive subspace-learning-based tracking scheme with a forgetting mechanism as a means to reduce tracking drift. The system is benchmarked on the CLEAR’06 CHIL seminar database, consisting of 26 lecture segments recorded inside the smart rooms of the UKA and ITC CHIL partners. Its resulting 3D single-person tracking performance is 86% accuracy with a precision of 88 mm, whereas the achieved face tracking score is 54% correct with 37% wrong detections and 19% misses. In terms of speed, an inefficient system implementation runs at about 2 fps on a P4 2.8 GHz desktop.
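The abstract does not say how face detections from two calibrated cameras are combined into a 3D head location. As an illustration only, the following minimal sketch uses midpoint triangulation, which is an assumption and not necessarily the authors' method. It further assumes that each 2D face center has already been back-projected, using the camera calibration, into a viewing ray given by a camera center and a direction; the function name `triangulate_midpoint` and all numeric values are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return tuple(x - y for x, y in zip(a, b))


def _along(origin: Vec3, scale: float, direction: Vec3) -> Vec3:
    # Point reached by moving `scale` times `direction` from `origin`.
    return tuple(o + scale * d for o, d in zip(origin, direction))


def triangulate_midpoint(c1: Vec3, d1: Vec3, c2: Vec3, d2: Vec3) -> Vec3:
    """Midpoint of the shortest segment joining two viewing rays.

    c1, c2: camera centers in room coordinates (assumed known from calibration).
    d1, d2: ray directions through the detected face centers (need not be unit length).
    """
    w = _sub(c1, c2)
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are (nearly) parallel")
    # Ray parameters that minimize the distance between the two rays.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = _along(c1, s, d1)
    p2 = _along(c2, t, d2)
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Hypothetical example: two corner cameras 4 m apart, both looking at one head.
head = triangulate_midpoint(
    (0.0, 0.0, 0.0), (2.0, 2.0, 1.5),
    (4.0, 0.0, 0.0), (-2.0, 2.0, 1.5),
)
```

With noisy detections the two rays generally do not intersect, so the midpoint serves as a simple compromise estimate; a tracker would typically smooth such estimates over time.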
Similar Resources
Speaker Tracking in Seminars by Human Body Detection
This paper presents evaluation results of a method for tracking speakers in seminars from multiple cameras. First, 2D human tracking and detection is done for each view. Then, 2D locations are converted to 3D based on the calibration parameters. Finally, cues from multiple cameras are integrated in an incremental way to refine the trajectories. We have developed two multi-view integration method...
3D Facial Landmark Tracking and Facial Expression Recognition
In this paper, we address the challenging computer vision problem of obtaining a reliable facial expression analysis from a naturally interacting person. We propose a system that combines a 3D generic face model, 3D head tracking, and 2D tracker to track facial landmarks and recognize expressions. First, we extract facial landmarks from a neutral frontal face, and then we deform a 3D generic fa...
Joint face and head tracking inside multi-camera smart rooms
The paper introduces a novel detection and tracking system that provides both frame-view and world-coordinate human location information, based on video from multiple synchronized and calibrated cameras with overlapping fields of view. The system is developed and evaluated for the specific scenario of a seminar lecturer presenting in front of an audience inside a “smart room”, its aim being to ...
Hybridization of Facial Features and Use of Multi Modal Information for 3D Face Recognition
Despite achieving good performance in controlled environments, conventional 3D face recognition systems still encounter problems in handling large variations in lighting conditions, facial expression, and head pose. Humans use a hybrid approach to recognize faces; therefore, in the proposed method, the human face recognition ability is incorporated by combining global and local ...
A Comparison of Multicamera Person-Tracking Algorithms
In this paper, we present a comparison of four novel algorithms that have been applied to the tracking of people in an indoor scenario. Tracking is carried out in 3D or 2D (ground plane) to provide position information for a variety of surveillance, HCI or meeting-support services. The algorithms, based on background subtraction, face detection, particle filter feature-matching and edge alignme...